NSF PAR Search | NSF Public Access Repository

Non Disruptive Disruption: An Empirical Experience of Introducing LLMs in the SOC

Hahn, Francis; Mamoon, Mohd; Bardas, Alexandru G; Collins, Michael; Dudek, Jaclyn; Lende, Daniel; Ou, Xinming; Rajagopalan, SR (February 2026, The Internet Society)

Security Operations Centers (SOCs) are high-stress, time-critical environments in which analysts manage multiple concurrent tasks and depend heavily on both technical expertise and effective communication. This paper examines the integration of Large Language Model (LLM) technologies into an operational SOC using an anthropological, fieldwork-based approach. Over a six-month period, two computer science graduate researchers were embedded within a corporate SOC, guided by an internal advocate, to observe workflows and assess organizational responses to emerging technologies. We began with an initial demonstration of an LLM-based incident response tool, followed by sustained participant observation and fieldwork within the incident response and vulnerability management teams. Drawing on these insights, we co-developed and deployed an LLM-based SOC companion platform supporting root cause analysis, query construction, and asset discovery. Continued in-situ observation was used to evaluate its impact on analyst practices. Our findings show that anthropological and sociotechnical approaches, coupled with practitioner co-creation, can enable the nondisruptive introduction of LLM companion tools by closely aligning development with existing SOC workflows.
Full Text Available
Using LLM Embeddings with Similarity Search for Botnet TLS Certificate Detection

https://doi.org/10.1145/3689932.3694766

Shashwat, Kumar; Hahn, Francis; Millar, Stuart; Ou, Xinming (November 2024, ACM)

Full Text Available
An Analysis of the Role of Situated Learning in Starting a Security Culture in a Software Company

Tuladhar, Anwesh; Lende, Daniel; Ligatti, Jay; Ou, Xinming (August 2021, Seventeenth Symposium on Usable Privacy and Security (SOUPS 2021))
We conducted an ethnographic study of a software development company to explore if and how a development team adopts security practices into the development lifecycle. A PhD student in computer science with prior training in qualitative research methods was embedded in the company for eight months. The researcher joined the company as a software engineer and participated in all development activities as a new hire would, while also making observations on the development practices. During the fieldwork, we observed a positive shift in the development team's practices regarding secure development. Our analysis of data indicates that the shift can be attributed to enabling all software engineers to see how security knowledge could be applied to the specific software products they worked on. We also observed that by working with other developers to apply security knowledge under the concrete context where the software products were built, developers who possessed security expertise and wanted to push for more secure development practices (security advocates) could be effective in achieving this goal. Our data point to an interactive learning process where software engineers in a development team acquire knowledge, apply it in practice, and contribute to the team, leading to the creation of a set of preferred practices, or "culture" of the team. This learning process can be understood through the lens of the situated learning framework, where it is recognized that knowledge transfer happens within a community of practice, and applying the knowledge is the key in individuals (software engineers) acquiring it and the community (development team) embodying such knowledge in its practice. Our data show that enabling a situated learning environment for security gives rise to security-aware software engineers. We discuss the roles of management and security advocates in driving the learning process to start a security culture in a software company.
Full Text Available
An Ethnographic Understanding of Software (In)Security and a Co-Creation Model to Improve Secure Software Development

Palombo, Hernan; Ziaie Tabari, Armin; Lende, Daniel; Ligatti, Jay; Ou, Xinming (August 2020, Proceedings of the Sixteenth Symposium on Usable Privacy and Security)

We present an ethnographic study of secure software development processes in a software company using the anthropological research method of participant observation. Two PhD students in computer science trained in qualitative methods were embedded in a software company for 1.5 years of total research time. The researchers participated in everyday work activities such as coding and meetings, and observed software (in)security phenomena both through investigating historical data (code repositories and ticketing system records), and through pen-testing the developed software and observing developers’ and management’s reactions to the discovered vulnerabilities. Our study found that 1) security vulnerabilities are sometimes intentionally introduced and/or overlooked due to the difficulty in managing the various stakeholders’ responsibilities in an economic ecosystem, and cannot be simply blamed on developers’ lack of knowledge or skills; 2) accidental vulnerabilities discovered in the pen-testing process produce different reactions in the development team, often contrary to what a security researcher would predict. These findings highlight the nuanced nature of the root causes of software vulnerabilities and indicate the need to take into account a significant amount of contextual information to understand how and why software vulnerabilities emerge during software development. Rather than simply addressing deficits in developer knowledge or practice, this research sheds light on at times forgotten human factors that significantly impact the security of software developed by actual companies. Our analysis also shows that improving software security in the development process can benefit from a co-creation model, where security experts work side by side with software developers to better identify security concerns and provide tools that are readily applicable within the specific context of the software development workflow.
Full Text Available
Experimental Study of Machine Learning based Malware Detection Systems’ Practical Utility

Li, Yuping; Caragea, Doina; Hall, Lawrence; Ou, Xinming (January 2020, HICSS Symposium on Cybersecurity Big Data Analytics)

Thanks to the extensive research on machine learning based malware detection (MLMD) in recent years and readily available online malware scanning systems (e.g., VirusTotal), it has become relatively easy to build a seemingly successful MLMD system using the following standard procedure: first prepare a set of ground truth data by checking with VirusTotal, then extract features from the training dataset and build a machine learning detection model, and finally evaluate the model with a disjoint testing dataset. We argue that such evaluation methods do not expose the real utility of ML based malware detection in practice, since the ML model is both built and tested on malware that is known at the time of training. The user could simply run the samples through VirusTotal, just as the researchers did to obtain the ground truth, instead of using the more sophisticated ML approach. However, ML based malware detection has the potential to identify malware that was not known at the time of training, which is the real value ML brings to this problem. We present an experimental study of how well a machine learning based malware detection system can achieve this. Our experiments showed that MLMD can consistently generate previously unknown malware knowledge, e.g., malware that is not detectable by existing malware detection systems at MLMD’s training time. Our research illustrates an ideal usage scenario for MLMD systems and demonstrates that such systems can benefit malware detection in practice. For example, by utilizing the new signals provided by the MLMD system and the detection capability of existing malware detection systems, we can more quickly uncover new malware variants or families.
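The evaluation idea in this abstract, scoring a detector only on malware that existing scanners did not yet recognize at training time, might be sketched as follows. This is an illustrative sketch and not the authors' experimental setup: the Sample record, its field names, and the novel_detection_rate helper are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    # All fields are hypothetical names chosen for this illustration.
    sample_id: str
    is_malware: bool          # ground truth established after the fact
    known_at_training: bool   # did existing scanners flag it at training time?


def novel_detection_rate(samples, model_flags):
    """Share of malware unknown to existing scanners at training time
    that the model nevertheless flags."""
    novel = [s for s in samples if s.is_malware and not s.known_at_training]
    if not novel:
        return 0.0
    flagged = sum(1 for s in novel if model_flags(s))
    return flagged / len(novel)
```

Here model_flags is any callable that takes a Sample and returns True when the trained model reports it as malicious. Samples that scanners already recognized are excluded from the score, since a lookup would have caught them anyway.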
Full Text Available
Hybrid Analysis of Android Apps for Security Vetting using Deep Learning

Chaulagain, Dewan; Poudel, Prabesh; Pathak, Prabesh; Roy, Sankardas; Caragea, Doina; Liu, Guojun; Ou, Xinming (July 2020, 2020 IEEE Conference on Communications and Network Security (CNS))

The phenomenal growth in the use of Android devices in recent years has been accompanied by a rise in Android malware. This reality warrants the development of tools and techniques to analyze Android apps at large scale for security vetting. Most state-of-the-art vetting tools are based on either static analysis or dynamic analysis. Static analysis has limited success if the malware app uses sophisticated evasion tricks. Dynamic analysis, on the other hand, may not find all the code execution paths, which lets some malware apps remain undetected. Moreover, the existing static and dynamic analysis vetting techniques require extensive human interaction. To address the above issues, we design a deep learning based hybrid analysis technique, which combines the complementary strengths of each analysis paradigm to attain better accuracy. Moreover, the automated feature engineering capability of the deep learning framework addresses the human interaction issue. In particular, using lightweight static and dynamic analysis procedures, we obtain multiple artifacts, and with these artifacts we train the deep learner to create independent models, and then combine them to build a hybrid classifier to obtain the final vetting decision (malicious apps vs. benign apps). The experiments show that our best deep learning model with hybrid analysis achieves an area under the precision-recall curve (AUC) of 0.9998. In this paper, we also present a comparative study of performance measures of the variants of the deep learning framework. Additional experiments indicate that our vetting system is fairly robust against imbalanced data and is scalable.
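The abstract does not say how the independent models are combined. One simple possibility is to average the malware probabilities from a static-analysis model and a dynamic-analysis model and threshold the result. The sketch below assumes that averaging rule, the 0.5 threshold, and the function name hybrid_verdict. None of these come from the paper.

```python
def hybrid_verdict(static_prob, dynamic_prob, threshold=0.5):
    """Average two per-model malware probabilities and threshold the result.

    Averaging is an assumed combination rule for illustration only.
    """
    for p in (static_prob, dynamic_prob):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    combined = (static_prob + dynamic_prob) / 2.0
    return "malicious" if combined >= threshold else "benign"
```

With this rule, an app that evades static analysis can still be flagged when the dynamic model is confident enough to pull the average over the threshold.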
Full Text Available
GPU-Based Static Data-Flow Analysis for Fast and Scalable Android App Vetting

https://doi.org/10.1109/IPDPS47924.2020.00037

Yu, Xiaodong; Wei, Fengguo; Ou, Xinming; Becchi, Michela; Bicer, Tekin; Yao, Danfeng Daphne (May 2020, 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS))

Many popular vetting tools for Android applications use static code analysis techniques. In particular, Inter-procedural Data-Flow Graph (IDFG) construction is the computation at the core of Android static data-flow analysis and consumes most of the analysis time. Many analysis tools use a worklist algorithm, an iterative fixed-point approach, to construct the IDFG. In this paper, we observe that a straightforward GPU parallelization of the worklist algorithm leads to significant underutilization of the GPU resources. We identify four performance bottlenecks, namely, frequent dynamic memory allocations, high branch divergence, workload imbalance, and irregular memory access patterns. Accordingly, we propose GDroid, a GPU-based worklist algorithm implementation with multiple fine-grained optimizations tailored to common characteristics of Android applications. The optimizations considered are: matrix-based data structure, memory access-based node grouping, and worklist merging. Our experimental evaluation, performed on 1000 Android applications, shows that the proposed optimizations are beneficial to performance, and GDroid can achieve up to 128X speedups against a plain GPU implementation.
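For readers unfamiliar with the worklist algorithm mentioned in this abstract, the following is a minimal sequential sketch of an iterative fixed-point computation over a graph. It is a generic textbook-style illustration, not GDroid or its IDFG construction. The graph encoding, the gen mapping, and the function name are assumptions made for the example.

```python
from collections import deque


def worklist_fixed_point(successors, gen, entry):
    """Propagate fact sets along edges until no node's facts change.

    successors: dict mapping every node to a list of its successor nodes.
    gen: dict mapping a node to the facts it generates (may omit nodes).
    entry: the node where propagation starts.
    """
    facts = {node: set() for node in successors}
    facts[entry] = set(gen.get(entry, ()))
    worklist = deque([entry])
    while worklist:
        node = worklist.popleft()
        for succ in successors[node]:
            # Facts reaching succ: everything at node plus what succ generates.
            incoming = facts[node] | set(gen.get(succ, ()))
            if not incoming <= facts[succ]:
                facts[succ] |= incoming
                worklist.append(succ)  # succ changed, so revisit its successors
    return facts
```

The loop terminates because fact sets only grow and are bounded by the finite set of generated facts. A node is re-queued only when its set actually changes, which is the property the abstract calls an iterative fixed-point approach.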
Full Text Available
Hybrid Analysis of Android Apps for Security Vetting using Deep Learning

https://doi.org/10.1109/CNS48642.2020.9162341

Chaulagain, Dewan; Poudel, Prabesh; Pathak, Prabesh; Roy, Sankardas; Caragea, Doina; Liu, Guojun; Ou, Xinming (June 2020, 2020 IEEE Conference on Communications and Network Security (CNS))
Full Text Available
JN-SAF: Precise and Efficient NDK/JNI-aware Inter-language Static Analysis Framework for Security Vetting of Android Applications with Native Code

https://doi.org/10.1145/3243734.3243835

Wei, Fengguo; Lin, Xingwei; Ou, Xinming; Chen, Ting; Zhang, Xiaosong (October 2018, Proceedings of the 2018 ACM SIGSAC Conference on Computer and Communications Security)

Full Text Available
